214 ◾ Bioinformatics
conditions were found to be linked to epigenetic mechanisms. An epigenome consists of
all epigenetic modifications on the genome of an organism that regulate the activity
(expression) of the genes; these modifications can be passed down to an organism’s off-
spring [1]. Epigenomics is the study of the complete set of epigenetic modifications on the
genome of an organism. Researchers use chromatin immunoprecipitation (ChIP) to examine
the interactions between epigenetic components (proteins and DNA) and to profile
DNA methylation in its original context. ChIP is used to identify the specific genes and
sequences where a protein of interest binds across the entire genome, providing critical
information about their regulatory functions and mechanisms. The laboratory protocol
of ChIP includes fixation of in vivo chromatin-bound proteins with formaldehyde to
stabilize the proteins on the chromatin, after which sonication or restriction enzymes are
used to cut the chromatin into short random fragments, usually around 200 bp. ChIP can also be
performed without formaldehyde crosslinking by digesting the chromatin with micrococcal
nuclease, an enzyme that can break chromatin into fragments of the desired size
(native ChIP). Antibodies specific for the proteins of interest are then added to the short
chromatin fragments. The antibodies form DNA–protein–antibody complexes that can be
immunoprecipitated and separated from the non-immunoprecipitated chromatin (DNA without
the protein of interest) using beads. The formaldehyde crosslinks are then reversed, either by
heating or by digesting the protein component of the chromatin. In the final step, only
the DNA fragments to which the proteins of interest were bound are isolated and purified.
At this point, we have the DNA fragments that were the targets of the epigenetic
modification. The next step is to characterize these fragments by identifying the sequences
of the binding sites and the affected genes. This provides important information about
the binding sites of the transcription factors (TFs), the function and regulation of the genes,
and the impact of gene activity on the condition studied.
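As a rough illustration of linking binding sites to affected genes, the sketch below assigns each binding site to the gene whose transcription start site (TSS) is nearest on the same chromosome. The gene names, coordinates, and the nearest-TSS rule itself are assumptions made for illustration only; real analyses rely on genome annotation files and dedicated annotation tools.

```python
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int  # transcription start site (hypothetical coordinate)


class BindingSite(NamedTuple):
    chrom: str
    start: int
    end: int


def nearest_gene(site: BindingSite, genes: List[Gene]) -> Optional[Gene]:
    """Return the gene whose TSS is closest to the midpoint of the binding site.

    Only genes on the same chromosome are considered; None is returned
    when the annotation has no gene on that chromosome.
    """
    midpoint = (site.start + site.end) // 2
    same_chrom = [g for g in genes if g.chrom == site.chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda g: abs(g.tss - midpoint))


# Hypothetical toy annotation and binding sites (not real data).
genes = [
    Gene("geneA", "chr1", 1_000),
    Gene("geneB", "chr1", 5_000),
    Gene("geneC", "chr2", 2_000),
]
sites = [
    BindingSite("chr1", 1_150, 1_350),
    BindingSite("chr1", 4_200, 4_400),
    BindingSite("chr3", 100, 300),
]

for site in sites:
    gene = nearest_gene(site, genes)
    label = gene.name if gene else "no annotated gene on this chromosome"
    print(f"{site.chrom}:{site.start}-{site.end} -> {label}")
```

Nearest-TSS assignment is only one possible heuristic; a binding site may regulate a more distant gene, so such assignments are best treated as candidates rather than conclusions.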
6.2 CHIP SEQUENCING
Researchers use different techniques to characterize the ChIP-purified DNA fragments,
including Southern blot, polymerase chain reaction (PCR), and microarrays.
More recently, however, sequencing these fragments with high-throughput methods has
become the most commonly used and effective method for studying epigenetic modifications.
The technique of sequencing the DNA fragments isolated by immunoprecipitation
is called ChIP-Seq.
Highly specific antibodies to the targeted proteins are used in ChIP-Seq so that only
the DNA fragments affected by the epigenetic modifications are isolated. ChIP-Seq
also requires control DNA from the same samples: either the non-immunoprecipitated
chromatin fragments, which come from DNA regions that were not affected by the epigenetic
modification, or the input DNA purified from the fragmented chromatin before
the antibody incubation step. The control DNA serves as a baseline to normalize the ChIP
data. Normalization with control DNA data can reduce false positives originating from
biases that cause overrepresentation of reads. Possible sources of bias include nonuniform
fragmentation during sonication (sonication bias), PCR amplification, which tends to
over-amplify GC-rich regions (PCR bias), sequencing bias, and mapping bias.
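The idea of using control DNA as a baseline can be sketched with a small example. The code below assumes that reads have already been counted in fixed genomic bins for both the ChIP and the control sample; it scales the control counts to the ChIP library size and reports a per-bin ratio. The function name, the pseudocount, the library-size scaling, and the toy counts are all assumptions for illustration, not the procedure of any particular ChIP-Seq tool.

```python
from typing import List


def fold_enrichment(chip: List[int], control: List[int],
                    pseudocount: float = 1.0) -> List[float]:
    """Return per-bin ChIP/control ratios after library-size scaling.

    The control counts are scaled so that both libraries have the same
    total number of reads; a pseudocount avoids division by zero.
    """
    if len(chip) != len(control):
        raise ValueError("ChIP and control must cover the same bins")
    chip_total = sum(chip)
    control_total = sum(control)
    if chip_total == 0 or control_total == 0:
        raise ValueError("both libraries must contain at least one read")
    scale = chip_total / control_total
    return [
        (c + pseudocount) / (k * scale + pseudocount)
        for c, k in zip(chip, control)
    ]


# Hypothetical read counts in five bins (not real data).
chip_counts = [10, 10, 90, 10, 80]
control_counts = [20, 20, 20, 20, 320]

ratios = fold_enrichment(chip_counts, control_counts)
for i, ratio in enumerate(ratios):
    print(f"bin {i}: fold enrichment = {ratio:.2f}")
```

In this toy data, the third bin stands out after normalization, whereas the last bin has a high raw ChIP count that is matched by an even higher control count, which is the kind of overrepresentation that would otherwise be mistaken for a binding site.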